generalize crate for multi-device PCIe passthrough #1573
Signed-off-by: Patrick Riel <priel@nvidia.com>
@elezar adding this for your review as well since it's adjacent to GPU work
…ng restart reconciliation, without rebinding or mutating sysfs. Signed-off-by: Patrick Riel <priel@nvidia.com>
    let vendor = read_sysfs_trimmed(&dev_dir.join("vendor"))?;
    if vendor != NVIDIA_VENDOR_ID {
        return Err(VfioError::NotNvidia {
            bdf: bdf.to_string(),
            vendor,
        });
    }
Question: Is this code NVIDIA-specific? If so, we may want to update the function name.
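If the check is meant to be reused for non-NVIDIA devices, one option is to pass the expected vendor in as an optional filter. The following is only a minimal sketch under stated assumptions: `check_vendor`, `VendorCheckError`, and its `VendorMismatch` variant are hypothetical names for illustration and are not part of this PR, and `read_sysfs_trimmed` is re-implemented locally so the example is self-contained.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical error type for this sketch; the crate's `VfioError` is not reproduced here.
#[derive(Debug)]
pub enum VendorCheckError {
    Io(io::Error),
    VendorMismatch {
        bdf: String,
        expected: String,
        found: String,
    },
}

/// Reads a sysfs attribute and strips surrounding whitespace (values end with '\n').
fn read_sysfs_trimmed(path: &Path) -> io::Result<String> {
    Ok(fs::read_to_string(path)?.trim().to_string())
}

/// Checks a device's vendor ID against an optional filter.
/// `None` accepts any vendor; `Some("0x10de")` reproduces an NVIDIA-only check.
pub fn check_vendor(
    dev_dir: &Path,
    bdf: &str,
    vendor_filter: Option<&str>,
) -> Result<String, VendorCheckError> {
    let vendor = read_sysfs_trimmed(&dev_dir.join("vendor")).map_err(VendorCheckError::Io)?;
    if let Some(expected) = vendor_filter {
        // Hex IDs may differ in letter case, so compare case-insensitively.
        if !vendor.eq_ignore_ascii_case(expected) {
            return Err(VendorCheckError::VendorMismatch {
                bdf: bdf.to_string(),
                expected: expected.to_string(),
                found: vendor,
            });
        }
    }
    Ok(vendor)
}
```

With this shape, an NVIDIA-specific caller would pass `Some("0x10de")`, while a class-agnostic caller would pass `None` and inspect the returned vendor string itself.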
Signed-off-by: Patrick Riel <priel@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
a112488 to 656eaf2
I worked through this locally with Codex, and after Codex called out some […]

The bug is in the cached-ID release path: […]

The proposed commit makes both BDF-based and cached-ID deregistration go through […]

Separately, Codex pointed out that another rollback boundary in […]

The function unbinds the current host driver before writing […]

I think this is worth treating as a sysfs state-transition boundary rather than […]

A representative regression test would be:

    #[test]
    fn test_bind_failure_reprobes_after_driver_override_failure() {
        let _refcount_guard = test_refcounts::guard();
        test_refcounts::clear("10de 26bb");

        let (tmp, sysfs) = setup_mock_sysfs();
        let bdf = "0000:2d:00.0";
        create_pci_device(&sysfs, tmp.path(), bdf, "0x10de", "0x26bb", "0x030000", 42);
        create_new_id_file(&sysfs);
        create_remove_id_file(&sysfs);
        create_probe_file(&sysfs);
        set_mock_driver(&sysfs, bdf, "nvidia");

        // Force driver_override to fail after the host driver has been unbound.
        fs::create_dir(sysfs.pci_device(bdf).join("driver_override")).unwrap();

        let err = bind_device_to_vfio(&sysfs, bdf).unwrap_err();
        assert!(matches!(err, VfioError::BindFailed { .. }));

        let probed = fs::read_to_string(sysfs.drivers_probe()).unwrap();
        assert_eq!(
            probed, bdf,
            "driver_override failure happens after unbind, so rollback should re-probe the host driver"
        );
    }

On the current PR, this fails because […]
elezar left a comment
Thanks @cheese-head. Feel free to drop the commit I suggested if you think it doesn't go in the right direction.
    let override_path = sysfs.pci_device_ref(bdf).driver_override_path();
    if let Err(e) = write_sysfs(&override_path, "vfio-pci") {
        deregister_vfio_new_id(sysfs, bdf);
        return Err(VfioError::BindFailed {
Should we also call cleanup_partial_bind in this error branch?
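One way to read the rollback concern raised in this thread: once the host driver has been unbound, every later failure should go through the same cleanup. The sketch below illustrates that idea against an in-memory mock rather than real sysfs. It is not the crate's implementation: the `SysfsWriter` trait, `MockSysfs`, `BindError`, `bind_to_vfio`, and the attribute names used as keys are all hypothetical, and `cleanup_partial_bind` here only borrows the name from the review comment. It also assumes that writing a bare newline clears `driver_override`.

```rust
use std::collections::HashSet;

/// Hypothetical abstraction over sysfs writes so the sketch runs in memory.
pub trait SysfsWriter {
    fn write(&mut self, attr: &str, value: &str) -> Result<(), String>;
}

/// Hypothetical error type; `step` names the write that failed.
#[derive(Debug)]
pub enum BindError {
    BindFailed { bdf: String, step: String, reason: String },
}

/// Best-effort rollback: clear the override, then re-probe so the host
/// driver can reclaim the device. Errors are deliberately ignored.
fn cleanup_partial_bind<S: SysfsWriter>(sysfs: &mut S, bdf: &str) {
    // Assumption: a bare newline clears driver_override.
    let _ = sysfs.write("driver_override", "\n");
    let _ = sysfs.write("drivers_probe", bdf);
}

pub fn bind_to_vfio<S: SysfsWriter>(sysfs: &mut S, bdf: &str) -> Result<(), BindError> {
    // A failure here leaves the device untouched, so nothing is rolled back.
    sysfs.write("driver/unbind", bdf).map_err(|reason| BindError::BindFailed {
        bdf: bdf.to_string(),
        step: "driver/unbind".to_string(),
        reason,
    })?;

    // From here on the device has no driver: every failure shares one rollback.
    let steps = [("driver_override", "vfio-pci"), ("drivers_probe", bdf)];
    for (attr, value) in steps {
        if let Err(reason) = sysfs.write(attr, value) {
            cleanup_partial_bind(sysfs, bdf);
            return Err(BindError::BindFailed {
                bdf: bdf.to_string(),
                step: attr.to_string(),
                reason,
            });
        }
    }
    Ok(())
}

/// In-memory mock: records successful writes and fails chosen attributes once.
#[derive(Default)]
pub struct MockSysfs {
    pub fail_once: HashSet<String>,
    pub log: Vec<(String, String)>,
}

impl SysfsWriter for MockSysfs {
    fn write(&mut self, attr: &str, value: &str) -> Result<(), String> {
        if self.fail_once.remove(attr) {
            return Err(format!("injected failure writing {}", attr));
        }
        self.log.push((attr.to_string(), value.to_string()));
        Ok(())
    }
}
```

Keeping the post-unbind steps in one loop means a new step cannot be added without inheriting the rollback, which is the property the regression test in this thread checks for the `driver_override` branch.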
Summary

Generalize openshell-vfio beyond GPU-only / single-device passthrough so it can serve as the binding/validation primitive layer the VM driver needs to implement RFC-0004's resource_requirements model. Adds atomic IOMMU-group binding, dry-run validation for ValidateSandboxCreate-style paths, class-agnostic device enumeration, and a correctness fix for partially-failed binds. Purely additive plus one bug fix; consumer crates are unchanged.

Related Issue

Foundational for RFC-0004 sandbox resource requirements (#1360). Unblocks multi-device-per-sandbox passthrough that the existing single-device API could not express when devices shared an IOMMU group (consumer GPUs + HDA + USB-C, multi-PF NICs, devices behind ACS-deficient PCIe switches).

Changes

- prepare_pci_group_for_passthrough / release_pci_group_from_passthrough for atomic bind/release of multiple PCIe devices sharing one IOMMU group. Rollback only restores devices newly bound by the call, so it does not steal bindings owned by other guards.
- validate_pci_for_passthrough / validate_pci_group_for_passthrough as dry-run pre-flight checks for ValidateSandboxCreate-style paths. They perform every structural and IOMMU-peer check without touching driver_override or any other kernel state. prepare_* now delegates to its validate counterpart to keep the two in lockstep.
- probe_host_vfio_candidates(sysfs, vendor_filter) for vendor-filtered, class-agnostic enumeration of passthrough-eligible PCI devices, so consumers can advertise DeviceClassCapability for arbitrary classes (GPUs, NICs, VFs) instead of being limited to probe_host_nvidia_vfio_readiness.
- PciBindGuard::companion_bdfs() accessor for consumer-side persistence of grouped bindings (crash-recovery state, status reporting).
- VfioError::GroupMismatch and VfioError::EmptyGroup for typed validation responses.
- bind_device_to_vfio now clears driver_override and re-probes the host driver on drivers_probe failure and on post-probe polling timeout. Previously a failed bind could leave the device wedged with driver_override="vfio-pci" pinned on disk, causing the next probe event to silently re-bind to vfio-pci.

Testing

- mise run pre-commit passes
- cargo test -p openshell-vfio passes (52/52, up from 32)
- cargo clippy -p openshell-vfio --all-targets -- -D warnings clean
- cargo check -p openshell-driver-vm clean (consumer crate compiles unchanged)

Checklist

- […] docs/ covers openshell-vfio today
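The group-binding behaviour listed under Changes (validate first without mutating anything, then bind all devices or none, rolling back only what the call itself changed) can be sketched against an in-memory model of the host. This is an illustration under stated assumptions, not the crate's code: `Host`, `Device`, `refuse_bind`, `validate_group`, `prepare_group`, and `GroupError` are hypothetical, and only the `EmptyGroup` and `GroupMismatch` variant names are borrowed from the PR description.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum GroupError {
    EmptyGroup,
    GroupMismatch { bdf: String, expected: u32, found: u32 },
    UnknownDevice(String),
    BindFailed(String),
}

#[derive(Clone, Debug)]
pub struct Device {
    pub iommu_group: u32,
    pub driver: String,
    /// Test hook (hypothetical): when true, binding to vfio-pci fails.
    pub refuse_bind: bool,
}

/// In-memory stand-in for sysfs state, keyed by BDF.
pub type Host = HashMap<String, Device>;

/// Dry run: checks structure only and never mutates `host`.
pub fn validate_group(host: &Host, bdfs: &[&str]) -> Result<u32, GroupError> {
    let first = bdfs.first().ok_or(GroupError::EmptyGroup)?;
    let expected = host
        .get(*first)
        .ok_or_else(|| GroupError::UnknownDevice(first.to_string()))?
        .iommu_group;
    for bdf in bdfs {
        let dev = host
            .get(*bdf)
            .ok_or_else(|| GroupError::UnknownDevice(bdf.to_string()))?;
        if dev.iommu_group != expected {
            return Err(GroupError::GroupMismatch {
                bdf: bdf.to_string(),
                expected,
                found: dev.iommu_group,
            });
        }
    }
    Ok(expected)
}

/// Binds every device or none. Returns (bdf, previous driver) for each
/// device that this call rebound.
pub fn prepare_group(host: &mut Host, bdfs: &[&str]) -> Result<Vec<(String, String)>, GroupError> {
    // Delegating to the dry run keeps validation and binding in lockstep.
    validate_group(host, bdfs)?;
    let mut newly_bound: Vec<(String, String)> = Vec::new();
    for bdf in bdfs {
        let dev = host.get_mut(*bdf).expect("validated above");
        if dev.driver == "vfio-pci" {
            continue; // bound before this call; not ours to roll back
        }
        if dev.refuse_bind {
            // Restore only what this call changed, newest first.
            for (b, prev) in newly_bound.into_iter().rev() {
                host.get_mut(&b).expect("bound above").driver = prev;
            }
            return Err(GroupError::BindFailed(bdf.to_string()));
        }
        let prev = std::mem::replace(&mut dev.driver, "vfio-pci".to_string());
        newly_bound.push((bdf.to_string(), prev));
    }
    Ok(newly_bound)
}
```

Tracking `newly_bound` separately from the requested list is what lets a failed call leave devices that were already on vfio-pci alone, matching the "does not steal bindings owned by other guards" note above.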